Genetic Diversity and Relationship of Shanlan Upland Rice Were Revealed Based on 214 Upland Rice SSR Markers

Shanlan upland rice (Oryza sativa L.) is a unique upland rice variety that has long been cultivated by the Li nationality; it has good drought resistance and high utilization value in drought-resistance breeding. To explore the origin of Shanlan upland rice and its genetic relationship with upland rice from other geographical sources, 214 upland rice cultivars from Southeast Asia and five provinces (regions) in southern China were analyzed for genetic diversity using SSR markers. Twelve SSR primers were screened and 164 alleles (Na) were detected, with 8 to 23 alleles per locus and an average of 13.667. The analysis of genetic diversity and the analysis of molecular variance (AMOVA) showed that the differences among the materials came mainly from individual upland rice accessions. Gene flow and genetic differentiation revealed the relationships among the upland rice populations: Hainan Shanlan upland rice presumably originated from upland rice in Guangdong province, and part of it may have differentiated from Hunan upland rice. These results indirectly support the view that the Li nationality in Hainan descends from the ancient Baiyue ethnic group, providing circumstantial evidence for the migration history of the Li nationality in Hainan as well as basic data for the protection of Shanlan upland rice and the innovative utilization of its germplasm resources.


Introduction
Upland rice, also known as aerobic rice, is an ecological type of rice [1]. In areas where a water layer is difficult to maintain or where rice fields are prone to drought, long-term natural selection has led to the gradual evolution of rice varieties adapted to dryland cultivation [2]. Upland rice generally has a well-developed root system and other drought-resistance mechanisms [3] and can be cultivated under fully upland conditions, so it is more drought-tolerant than irrigated rice [4,5]. Shanlan upland rice (Oryza sativa L.) is an ancient local upland rice variety in Hainan province (China). It has a short growth period, a low water requirement, and tolerance to drought and barren soil [6,7]. Most of the grain produced by Shanlan upland rice is japonica glutinous rice with good quality and taste [8]. More than 85% of Shanlan upland rice accessions are of the japonica type and are closely related to the common wild rice in Guangdong (China) and Hunan (China) but less closely related to the common wild rice in Hainan (China); it has therefore been speculated that Shanlan upland rice originated from common wild rice in Guangdong (China) and Hunan (China) [9]. As original parent material for drought-resistance breeding, Shanlan upland rice has high utilization value.
In rice cross breeding, the development and utilization of favorable gene sources from Shanlan upland rice deserve full attention [10]. Upland rice differs from irrigated rice in its adaptations and drought-resistance mechanisms [11], so hybrids between them may show greater drought resistance and adaptability. This makes hybridization between irrigated and upland rice an essential mode of genetic improvement [12,13] that can provide more options and support for agricultural production in arid areas. Studying the genetic diversity and genetic relationships of Shanlan upland rice can therefore help identify better drought-resistant lines for hybrid rice breeding.
A simple sequence repeat (SSR) is a tandem repeat of a motif of 1 to 6 nucleotides. Because SSR genotyping is fast and specific, it is often used as a rapid diagnostic tool [14]. Hinge et al. used SSR molecular markers as specific tags for banana species identification, which provides insights for further verification of the consistency of plant populations [15]. In recent years, SSR molecular markers have been widely used in animal and plant population analysis and in plant genetic breeding [16][17][18]. Using SSR markers, Jasim et al. investigated the genetic diversity of aromatic rice germplasm and found that 89% of the total variation came from within populations, while 11% came from among populations [19]. Hassan et al. used 37 SSR markers to analyze the genetic diversity of 62 rice accessions in the Kurdistan region, indicating that 72% of the differences occurred between indica and japonica populations [20]. Liu et al. analyzed the genetic diversity of 1481 individuals using SSR markers and SNP haplotypes [21]. SSR markers have also played an essential role in studies of the genetic diversity of cowpea [22], barley [23], beans [24,25], maize [26], and other crops. It is therefore feasible to screen specific SSR primers for genetic diversity analysis of Shanlan upland rice in order to explore its geographical origin.
The Li nationality is now recognized as the first ethnic group to have settled in Hainan (China). Scholars have proposed that the Li nationality in Hainan originated in the northern mainland or in Southeast Asia, but the most influential view, accepted by most scholars, is that the Li nationality descended from the ancient Baiyue ethnic group. The primitive agriculture of Hainan Island began about 10,000 years ago in the Neolithic age. Modern archaeology shows that the earliest residents of Hainan Island, the people of Luobi Cave, lived in Sanya (Hainan, China), where common wild rice, the ancestor of cultivated rice, was distributed, but archaeological excavations have not found any rice remains. Current research shows that Hainan Shanlan upland rice is only distantly related to Hainan common wild rice, so Shanlan upland rice did not originate from the common wild rice in Hainan (China) but was probably introduced from outside Hainan Island. The emergence and expansion of rice farming in China followed the continuous migration of ancient Baiyue ethnic groups, and "the rice growing techniques, rice culture, and rice seeds spread together" [27]. It is therefore worth studying from where Shanlan upland rice was introduced to Hainan Island. We selected original upland rice resources from the provinces surrounding Hainan and from Southeast Asia and analyzed the genetic diversity and genetic relationships between Shanlan upland rice and upland rice from different geographical regions, in order to determine the possible geographical source of Shanlan upland rice in Hainan and to provide basic data for the protection of Shanlan upland rice as an essential upland rice resource and for the innovative utilization of its germplasm.

Development of Polymorphic SSR Primers
We screened 48 pairs of primers from the rice database by gel electrophoresis, and 12 SSR primers were obtained for genetic diversity analysis (Table 1). A total of 164 alleles (Na) were detected at 12 SSR loci, with Na distribution ranging from 8 (LRJ89) to 23 (LRJ62) at each locus with an average Na of 13.667. The number of effective alleles (Ne) varied

Gene Flow among Upland Rice Populations from Different Geographical Origin
We analyzed gene flow and genetic differentiation among the 14 populations of different geographic origins (Table 2). Gene flow (Nm) was greater than 1 for all pairs of populations, indicating a relatively high level of gene exchange between geographic populations, which might be caused by human activities such as variety selection and the exchange of introduced materials among groups. For example, the level of gene exchange between the Philippines and Malaysia was relatively high, suggesting that mutual introduction between the two regions has led to gene exchange between the two populations. The lowest level of gene exchange was between Guizhou (China) and Vietnam, which may reflect the geographic distance and limited human activity between the two places. In the lower triangle of the table, the Fst values between the germplasm resources of Vietnam and those of other areas were the highest, ranging from 0.071 to 0.119, indicating the strongest genetic differentiation among these populations. Among the comparisons involving Hainan (China), the Fst with Hunan (China) was the lowest (0.038), followed by that with Guangdong (China) (0.040). Guangxi (China) and Guangdong (China) had an Fst of 0.032, and the Fst values among the five provinces of southern China ranged from 0.032 to 0.064. The minimal Fst and the minimal genetic distance gave consistent results.

Genetic Diversity of Upland Rice from Different Geographical Sources
We analyzed the genetic diversity of 214 materials from 14 different geographical sources (Table 3). Laos had the highest Na (7.917) and Vietnam the lowest (2.333). The Philippines had the highest Ne (4.512) and Vietnam the lowest (1.952). Indonesia had the highest I (1.625), indicating higher diversity in this region. In Myanmar, Ho was very low (0.012) while He was 0.713, indicating relatively large genetic variation. In Hainan (China), Na (6.583), Ne (3.008), and I (1.268) were at an intermediate level among the 14 regions, as were Ho (0.076) and He (0.630). Fst is an index of population differentiation determined by population genetic structure [28], with values ranging from 0 to 1: a value of 1 indicates complete genetic differentiation among populations, while 0 indicates no differentiation. The fixation index values in the table are all relatively high (>0.7). Because they are calculated within each population and agree with the low Ho relative to He, they are best read as inbreeding coefficients (Fis) that reflect a strong heterozygote deficit, as expected for a self-pollinating crop, rather than as differentiation between populations.
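The heterozygosity figures quoted above can be cross-checked with a small calculation. The sketch below assumes the usual per-population fixation index F = 1 − Ho/He; the function name is illustrative, and values computed from population means may differ slightly from an average of per-locus values.

```python
def fixation_index(ho: float, he: float) -> float:
    """Return F = 1 - Ho/He, the per-population fixation (inbreeding) index."""
    if he <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - ho / he

# (Ho, He) pairs quoted in the text for two of the 14 populations.
reported = {"Hainan": (0.076, 0.630), "Myanmar": (0.012, 0.713)}

f_values = {pop: round(fixation_index(ho, he), 3) for pop, (ho, he) in reported.items()}
print(f_values)
```

Both populations give values well above 0.7, which fits a strong heterozygote deficit within populations.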

Genetic Similarity Analysis of Upland Rice from Different Geographical Sources
Nei's genetic distance between populations was calculated in POPGEN 1.3.2 [29]. The results (Figure 1A) showed that the genetic distance between populations ranged from 0.160 to 0.437, with an average of 0.258, indicating that the genetic base of the germplasm resources was relatively broad. Among the 14 populations, the genetic distance between Guizhou (China) and Vietnam was the largest (0.437), indicating that these two germplasm groups were distantly related, while the distance between Guangdong (China) and Guangxi (China) was the smallest (0.160), indicating that they were closely related. For Hainan (China), the smallest genetic distance was to Guangdong (China) (0.165), followed by Guangxi (China) (0.166) and Guizhou (China) (0.177). Cluster analysis of Nei's genetic distances with the unweighted pair group method with arithmetic mean (UPGMA) (Figure 1B) showed that Guangxi (China) and Guangdong (China) were the most closely clustered of the 14 regional populations. These results indicated that Shanlan upland rice in Hainan (China) was closely related to upland rice in Guangdong (China) and Guangxi (China). The genetic distance between Hainan (China) and Hunan (China) was 0.191, whereas the distances between Hunan (China) and Guangxi (China), Guangdong (China), and Guizhou (China) were between 0.2 and 0.3, indicating that Shanlan upland rice in Hainan (China) was also closely related to upland rice in Hunan (China). Hainan (China) was most distant from Myanmar and Vietnam, with genetic distances of 0.321 and 0.320, respectively.
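To make the distance measure concrete, the following is a minimal sketch of Nei's standard genetic distance, D = −ln(Jxy / sqrt(Jx · Jy)), computed from allele frequencies. The text does not state which variant of Nei's distance was used, so the standard (1972) form is an assumption here, and the function name and toy frequencies are purely illustrative.

```python
import math

def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    Each population is a list of loci; each locus is a dict mapping
    allele -> frequency. Loci must be listed in the same order.
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        alleles = set(fx) | set(fy)
        jx += sum(fx.get(a, 0.0) ** 2 for a in alleles)               # identity within X
        jy += sum(fy.get(a, 0.0) ** 2 for a in alleles)               # identity within Y
        jxy += sum(fx.get(a, 0.0) * fy.get(a, 0.0) for a in alleles)  # identity between X and Y
    if jxy == 0.0:
        return math.inf  # no shared alleles at any locus
    return -math.log(jxy / math.sqrt(jx * jy))

# Hypothetical single-locus frequencies (allele sizes in bp), not data from this study.
pop_a = [{"150": 0.5, "154": 0.5}]
pop_b = [{"150": 0.5, "158": 0.5}]
d_ab = nei_standard_distance(pop_a, pop_b)
print(round(d_ab, 3))
```

Two populations with identical allele frequencies give a distance of 0, and the distance grows as fewer alleles are shared.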

Analysis of Molecular Variance
We used AMOVA to explore genetic variation in the 14 upland rice populations. The results showed that 4.34% of the genetic variation existed among populations and about 95% existed among the 214 upland rice accessions. Genetic differentiation among populations was small (0.043, p < 0.001), which is consistent with relatively high gene flow between populations. The high Fis (0.871) indicated a strong deficit of heterozygotes within populations, as expected for a self-pollinating crop. In general, individual variation was the main source of total variation in the upland rice samples (Table 4).

Cluster Analysis and Principal Coordinate Analysis of Upland Rice from Different Geographical Sources
To further analyze the genetic relationships among the materials, UPGMA clustering based on Nei's genetic distance was performed, and the 214 materials were clustered into two major groups (Figure 2A). The first group (green branches) included 30 Shanlan upland rice accessions, of which 17 clustered together separately, indicating that they were closely related or of common origin. The genetic distances among the 214 materials were also analyzed by principal coordinate analysis (PCoA). The first and second principal coordinates explained 17.68% and 7.44% of the variation, respectively, or 25.12% of the total genetic variation together. PCoA also divided the materials into two groups (Figure 2B). Part of the Shanlan upland rice formed a first group separated from the other upland rice, indicating that these Shanlan accessions were genetically distant from the other upland rice and closely related to one another. The second group contained the other upland rice together with the remaining Shanlan upland rice, indicating that these Shanlan accessions were genetically close to the upland rice in this group, which may be the result of inter-regional introduction. The UPGMA clustering results were consistent with the PCoA, and all materials were classified into two groups.

Discussion
As a representative of the traditional farming culture of the ethnic minorities in Hainan, Shanlan upland rice is an important genetic resource for food and agriculture. Studying the origin and differentiation of Shanlan upland rice is of great significance for its conservation and utilization. SSR variation is closely related to species evolution [30], and SSR markers have become a powerful tool for studying species evolution and genetic variation [31]. In this study, we screened 12 SSR loci for genetic diversity analysis in 214 materials. The mean values of Ne and PIC across the 12 loci were 4.297 and 0.703, respectively, and the polymorphism of these loci was higher than that reported by Jasim et al. [19] for aromatic rice. Our mean values of Na (13.667) and I (1.728) were higher than those reported by Yang et al. [32] in Shanlan upland rice. PIC measures the polymorphism of primers, and our PIC was higher than that reported in other rice studies [33,34]. In general, our loci are highly polymorphic and can be used to study the genetic diversity of upland rice resources.
In the AMOVA, we found that the genetic variation in the 14 populations was mainly among individuals. Environmental factors can influence the genetic variation of crops. Upland rice is a type of cultivated rice that is adapted to drought stress and aerobic conditions; it has evolved through long-term selection in dry land without a water layer [7]. Environmental conditions affect plant growth and reproduction and thereby genetic variation and adaptability [35,36]. The 214 materials were all upland rice, so shared environmental conditions such as drought may have driven them to evolve in the same direction, which would explain the small differences between populations. However, because rice is a self-pollinating crop, even the same variety under the same geographical conditions may produce different traits, and these differences may be gradually amplified by long-term environmental effects or artificial selection, resulting in large differences between individuals. Therefore, in rice genetic improvement, more attention should be paid to selecting excellent individuals within a population than to selecting among populations.
The study of genetic diversity can help reveal the evolutionary history of different species or of populations of the same species, allowing a deeper comparison of the potential of different groups and of the evolutionary direction of species or populations [37]. In our results, among the comparisons involving Hainan (China), Nm was largest with Hunan (China) (6.343), followed by Guangdong (China) (5.985). The coefficient of genetic differentiation was likewise smallest between Hainan (China) and Hunan (China) (0.038), followed by Hainan (China) and Guangdong (China) (0.040). In terms of genetic distance, Hainan (China) was closest to Guangdong (China) (0.165), followed by Guangxi (China) (0.166) and Guizhou (China) (0.177); the genetic distance between Hainan (China) and Hunan (China) was 0.191. In general, Shanlan upland rice seems to be most closely related to the resources in Guangdong (China) and Hunan (China). Guangdong (China) is the province geographically closest to Hainan (China). In the clustering of the 214 materials, Shanlan upland rice fell into two types: part of it formed a single cluster and the remainder was dispersed among other upland rice. The genetic diversity of Hainan (China) Shanlan upland rice was at an intermediate level, which means that there are considerable genetic differences between individuals in the population. In addition, the analysis of molecular variance showed that the variation among the 214 materials was mainly between individuals. These results indicated that different individuals in the Shanlan upland rice population were affected differently by external factors. Overall, Shanlan upland rice appears to derive mainly from upland rice from Guangdong (China) and Hunan (China).
Baiyue refers to the ancient Yue peoples and the area they inhabited along the coast of southern China. The view that "the Li nationality in Hainan originated from the Baiyue people", that is, that the Li nationality split from one of the Baiyue ethnic groups, is widely supported by Chinese classical philologists, archaeologists, geneticists, anthropologists, and ethnologists [38]. However, there is no ancient written record of this claim, and archaeological evidence is also lacking. To this end, we selected original upland rice resources from the provinces surrounding Hainan (China) and from Southeast Asia and analyzed the genetic diversity and genetic relationships between upland rice and Shanlan upland rice to determine the possible geographical origin of Shanlan upland rice in Hainan (China). From the genetic differentiation coefficient and genetic distance, we found that Shanlan upland rice from Hainan (China) is closely related to upland rice from Guangdong (China) and Hunan (China) but only distantly related to upland rice from Southeast Asian countries, so Shanlan upland rice possibly came from Guangdong (China) and Hunan (China). Guangdong is the province geographically closest to Hainan, and the upland rice growing area of southern Hunan borders Guangdong and Guangxi and belonged to Baiyue (Nanyue). These results indirectly support the view that the Li nationality in Hainan (China) originated from a branch of Baiyue rather than from Southeast Asia, which provides circumstantial evidence for the migration history of the Li nationality in Hainan (China).

Experimental Materials and DNA Extraction
A total of 214 materials were selected for the study, of which 55 were Shanlan upland rice from Hainan (China) and 159 were upland rice from other geographical sources (Supplementary File S001). The seeds were sown in the experimental field of the "Changshuiyang base", Lingshui, Hainan (China), located at 18°29′39.70″ N, 110°02′30.95″ E, which has a tropical monsoon island climate. Fresh leaves from 3 randomly selected plants were quick-frozen in liquid nitrogen, and DNA was extracted from the leaves with a magnetic bead kit (NMG2611-96, Wuhan Nano Magnetic Biotechnology Co., Ltd., Wuhan, China) and stored at −20 °C. The concentration and purity of the DNA were measured with a NanoDrop 8000 spectrophotometer.

Primer Screening and Genetic Diversity Analysis
Forty-eight pairs of primers were selected from the rice database, and a 21 bp linker sequence (GAAGGTGACCAAGTTCATGCT) was added to the upstream primer sequences. To screen primers, 17 varieties from 15 regions were used for PCR amplification, and 12 polymorphic primers were selected for genotyping the 214 rice materials (Supplementary File S002). We then performed fluorescent PCR, diluted the product to 10-20 ng/µL, prepared the loading mixture (1 µL fluorescent PCR product, 0.5 µL GeneScan™ 500 LIZ, and 8.5 µL Hi-Di™ Formamide), and ran the SSR fragment analysis on an ABI 3730xl. The raw genotype data from the ABI 3730xl were then imported into GenAlEx version 6.501 [39] to calculate genetic diversity indicators for the SSR loci and populations, including the number of observed alleles (Na), number of effective alleles (Ne), Shannon index (I), polymorphism information content (PIC), observed heterozygosity (Ho), expected heterozygosity (He), and inbreeding coefficient (Fis).
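As an illustration of what these indicators measure, the sketch below computes them for a single locus from diploid genotypes using textbook definitions (Ne = 1/Σp², He = 1 − Σp², I = −Σ p ln p, and PIC = 1 − Σp² − Σ 2pᵢ²pⱼ²). It is not GenAlEx's implementation, whose small-sample corrections may differ, and the function name and toy genotypes are hypothetical.

```python
import math
from collections import Counter

def locus_diversity(genotypes):
    """Summarise one SSR locus from diploid genotypes given as (allele, allele) pairs."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    freqs = [n / total for n in counts.values()]
    homozygosity = sum(p * p for p in freqs)  # sum of squared allele frequencies
    he = 1.0 - homozygosity                   # expected heterozygosity
    ho = sum(a != b for a, b in genotypes) / len(genotypes)  # observed heterozygosity
    # PIC additionally subtracts the share of uninformative heterozygote matings.
    pic = he - sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return {
        "Na": len(freqs),                                # observed alleles
        "Ne": 1.0 / homozygosity,                        # effective alleles
        "I": -sum(p * math.log(p) for p in freqs),       # Shannon index
        "Ho": ho,
        "He": he,
        "PIC": pic,
    }

# Hypothetical fragment sizes (bp) for four plants at one locus.
toy_locus = [(150, 150), (150, 154), (154, 154), (150, 154)]
stats = locus_diversity(toy_locus)
print(stats)
```

Population-level values such as those in Table 3 would then be averages of these per-locus statistics over the 12 loci.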

Molecular Variance Analysis (AMOVA) and Gene Flow Estimation
The original genotype data of Shanlan upland rice and upland rice were used to calculate the variation, differentiation, and significance tests in GenAlEx version 6.501. Gene flow (Nm) was calculated from the genetic differentiation coefficient (Fst) obtained from GenAlEx version 6.501 [40] as Nm = 0.25 × (1 − Fst)/Fst.
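The conversion from Fst to Nm can be sketched in a few lines. The example assumes the widely used island-model relation Nm = (1 − Fst)/(4 · Fst), which reproduces the Nm values reported in this paper to within rounding of Fst; the function name is illustrative.

```python
def gene_flow(fst: float) -> float:
    """Estimate gene flow Nm from Fst, assuming Nm = (1 - Fst) / (4 * Fst)."""
    if not 0.0 < fst < 1.0:
        raise ValueError("Fst must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)

# Fst values quoted in the text for Hainan versus Hunan and Guangdong.
for pair, fst in {"Hainan-Hunan": 0.038, "Hainan-Guangdong": 0.040}.items():
    print(pair, round(gene_flow(fst), 3))
```

With the rounded Fst values 0.038 and 0.040 this gives about 6.33 and 6.00, close to the reported 6.343 and 5.985; the small gaps presumably come from rounding of Fst. Under this relation, Nm > 1 corresponds to Fst < 0.2.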

Data Processing
Data were collated and analyzed using Excel 2019, and UPGMA clustering and plotting were performed in R (version 4.0.0). Nei's genetic distances (Supplementary Files S003 and S004) were clustered in R using the "hclust" function, and tree graphs were drawn using the R package "ggtree".
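For readers unfamiliar with UPGMA, the following is a minimal pure-Python sketch of the procedure (average linkage, which is what "hclust" performs with method = "average"). It is written in Python rather than R only for illustration, is not the analysis script used in this study, and uses a hypothetical four-population distance matrix.

```python
from itertools import combinations

def upgma(labels, dist):
    """Average-linkage clustering.

    dist maps frozenset({a, b}) -> distance for every pair of labels.
    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    sizes = {label: 1 for label in labels}
    d = dict(dist)
    merges = []
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        height = d[frozenset((a, b))]
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        new = f"({a},{b})"
        # Distance to the merged cluster is the size-weighted mean of the old distances.
        for other in sizes:
            d[frozenset((new, other))] = (
                size_a * d[frozenset((a, other))] + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[new] = size_a + size_b
        merges.append((a, b, height))
    return merges

# Hypothetical pairwise distances between four populations A-D.
toy = {
    frozenset("AB"): 0.16, frozenset("AC"): 0.30, frozenset("BC"): 0.34,
    frozenset("AD"): 0.40, frozenset("BD"): 0.44, frozenset("CD"): 0.42,
}
merges = upgma(["A", "B", "C", "D"], toy)
for step in merges:
    print(step)
```

The closest pair is joined first, and each later join uses the average distance between all members of the two clusters, which is why closely related populations end up on neighbouring branches of the tree.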

Conclusions
In this study, SSR markers were used to analyze the genetic diversity of Shanlan upland rice. Twelve pairs of polymorphic primers were screened and used to genotype 214 materials from different regions. We found that the genetic differences among upland rice from different regions existed mainly between individuals, that the Shanlan upland rice population from Hainan (China) also showed genetic differences among individuals, and that Shanlan upland rice may have come from upland rice in Guangdong (China) and Hunan (China). This study provides some theoretical support for further exploration of the origin of the Li nationality from the perspective of rice seed sources.